A Bayesian Model of Human Sentence Processing
Abstract
Language comprehension is a classic problem of reasoning under uncertainty. Language comes to us as a noisy, unsegmented, ambiguous mass of auditory waveforms or visual stimuli. Humans must somehow combine this input with other knowledge to arrive at reasonable interpretations and actions. How might humans address this problem of decision-making under uncertainty? The best normative model we have for solving problems of this sort is probability theory, which offers a principled method with a coherent semantics for weighing and combining evidence. Whether this normative model is the correct descriptive model for all of human behavior has recently been the subject of much debate (Kahneman; Gigerenzer). While this debate is not resolved for all areas of human cognition and reasoning, the last decade or so has produced an emerging consensus throughout the cognitive sciences that in some areas human cognition is likely to make use of probabilistic models. The seminal work of Anderson (1990) gave Bayesian underpinnings to cognitive models of memory, categorization, and causation, and recent Bayesian models of human cognition include work in human visual processing (Rao et al., 2001; Weiss & Fleet, 2001), categorization (Tenenbaum, 2000; Tenenbaum & Griffiths, 2001a, 2001b), and the human understanding of causation (Rehder, 1999; Glymour & Cheng, 1998). Together, these ideas suggest that perhaps the process of human language comprehension is also best modeled as a process of probabilistic, Bayesian reasoning. The idea that human processing of language draws on probabilistic models is hardly novel. Schuchardt (1885), in his arguments against the 19th-century Neogrammarians, pointed out the key role of frequency in language production and language change. Schuchardt noted that word frequency is a good predictor of which words are phonologically weakened or ‘lenited’.
Words which are more frequent tend to be shorter and phonologically simplified; Zipf (1929) pointed out that this reduction of frequent forms also happened for frequent phones. Jespersen (1922) expanded Schuchardt’s idea from pure frequency to predictability or probability. Jespersen pointed out that the predictability of a word in its context, in addition to its raw frequency, must play a role in the phonological form of the word. These early intuitions about frequency and probability were all related to language production. Evidence for the role of frequency and probability specifically in language comprehension dates from quite a bit later, the mid-20th century. In the 1950s, for example, Davis Howes showed that word frequency plays a key role in comprehension in both the visual and auditory domains (Howes & Solomon, 1951; Howes, 1957). Throughout the second half of the 20th century, evidence amassed that high-frequency words are accessed more quickly, more easily, and with less input signal than low-frequency words. This is a very robust effect, supported by tachistoscopic recognition (Howes & Solomon, 1951), naming (Forster & Chambers, 1973), lexical decision (Rubenstein, Garfield, & Millikan, 1970; Whaley, 1978; Balota & Chumbley, 1984), recognition accuracy and errors in noise (Howes, 1957; Savin, 1963), and gating (Grosjean, 1980). The last two decades of behavioral research have extended these lexical results to other areas of psycholinguistics such as sentence processing. We know that many kinds of probabilistic knowledge play a role in the comprehension of sentences. One such factor is the probability of the different lexical categories of a word. For example, the a priori probability that the ambiguous word fires is a noun, or alternatively a verb, plays a role in sentence comprehension, as does the probability that the word selected is a preterite or a participle (Burgess & Hollbach, 1988; Trueswell,
Publication date: 1998